Abstract

Dual-tree algorithms are a widely used class 
of branch-and-bound algorithms. Unfortu- 
nately, developing dual-tree algorithms for 
use with different trees and problems is often 
complex and burdensome. We introduce a 
four-part logical split: the tree, the traversal, 
the point-to-point base case, and the pruning 
rule. We provide a meta-algorithm which allows
development of dual-tree algorithms in a
tree-independent manner and easy extension to
entirely new types of trees. Representations are
provided for five common algorithms; for
k-nearest neighbor search, this
leads to a novel, tighter pruning bound. The 
meta-algorithm also allows straightforward 
extensions to massively parallel settings. 

1. Introduction 

In large-scale machine learning applications, algorith- 
mic scalability is paramount. Hence, much study has 
been put into fast algorithms for machine learning 
tasks. One commonly-used approach is to build trees 
on data and then use branch-and-bound algorithms to 
minimize runtime. A popular example of a
branch-and-bound algorithm is the use of trees for
nearest neighbor search, pioneered by Bentley (1975), and
subsequently modified to use two trees ("dual-tree") 
(Gray & Moore, 2001). Later, an optimized tree struc- 
ture, the cover tree, was designed (Beygelzimer et al., 
2006), giving provably linear scaling in the number of 
queries (Ram et al., 2009) — a significant improvement 
over the quadratically-scaling brute-force algorithm. 

Asymptotic speed gains as dramatic as described 
above are common for dual-tree branch-and-bound al- 
gorithms. These types of algorithms can be applied 
to a class of problems referred to as 'n-body
problems' (Gray & Moore, 2001). The n-point
correlation, important in astrophysics, is an n-body problem
and can be solved quickly with trees (March et al., 
2012). In addition, Euclidean minimum spanning 
trees can be found quickly using tree-based algorithms 
(March et al., 2010). Other dual-tree algorithms 
include kernel density estimation (Gray & Moore,
2003a), mean shift (Wang et al., 2007), Gaussian
summation (Lee & Gray, 2006), fast singular value
decomposition (Holmes et al., 2008), range search,
furthest-neighbor search, and many others.

The dual-tree algorithms referenced above are each 
quite similar, but no formal connections between the 
algorithms have been established. The types of trees 
used to solve each problem may differ, and in addition, 
the manner in which the trees are traversed can differ 
(depending on the problem or the tree). In practice, 
a researcher may have to implement entirely separate 
algorithms to solve the same problems with different 
trees; this is time-consuming and makes it difficult to 
explore the properties of tree types that make them 
more suited for particular problems. Worse yet, par- 
allel dual-tree algorithms are difficult to develop and 
appear to be far more complex than serial implemen- 
tations; yet, both solve the same problem. We make 
these contributions to address these shortcomings: 

• A representation of dual-tree algorithms as 
four separate components: a space tree, a 
traversal, a base case, and a pruning rule. 

• A meta-algorithm that produces dual-tree al- 
gorithms, given those four separate components. 

• Base cases and pruning rules for a variety of 
dual-tree algorithms, which can be used with 
any space tree and any traversal. 

• A theoretical framework, used to prove the
correctness of these meta-algorithms and develop
a new, tighter bound for k-nearest neighbor
search.

• Implications of our representation, including easy
creation of large-scale distributed dual-tree
algorithms via our meta-algorithm.

2. Overview of Meta-Algorithm

In other works, dual-tree algorithms are described as
standalone algorithms that operate on a query dataset
$S_q$ and a reference dataset $S_r$. By observing
commonalities in these algorithms, we propose the following
logical split of any dual-tree algorithm into four parts:



• A space tree (a type of data structure).

• A pruning dual-tree traversal, which visits nodes
in two space trees, and is parameterized by a
BaseCase() and a Score() function.

• A BaseCase() function that defines the action to
take on a combination of points.

• A Score() function that determines if a subtree
should be visited during the traversal.

We can use this to define a meta-algorithm: 

Given a type of space tree, a pruning dual-tree
traversal, a BaseCase() function, and a Score()
function, use the pruning dual-tree traversal with
the given BaseCase() and Score() functions on two
space trees $\mathscr{T}_q$ (built on $S_q$) and
$\mathscr{T}_r$ (built on $S_r$).

In Sections 3 and 4, space trees, traversals, and related
quantities are rigorously defined. Then, Sections 5-8
define BaseCase() and Score() for various dual-tree
algorithms. Section 9 discusses implications and future
possibilities, including large-scale parallelism.

3. Space Trees 

To develop a framework for understanding dual-tree 
algorithms, we must introduce some terminology. 

Definition 1. A space tree on a dataset
$S \in \mathbb{R}^{N \times D}$ is an undirected, connected,
acyclic, rooted simple graph with the following properties:

• Each node (or vertex) holds a number of points
(possibly zero) and is connected to one parent node
and a number of child nodes (possibly zero).

• There is one node in every space tree with no
parent; this is the root node of the tree.

• Each point in $S$ is contained in at least one node
of the tree.

• Each node $\mathscr{N}$ of the tree has a convex subset of
$\mathbb{R}^D$ that contains each of the points in that node
as well as the convex subsets represented by each
child of the node.

Figure 1. An example space tree. (a) Abstract
representation. (b) $\mathbb{R}^2$ representation.

Notationally, we use the following conventions:

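• The set of child nodes of a node $\mathscr{N}_i$ is denoted
$\mathscr{C}(\mathscr{N}_i)$ or $\mathscr{C}_i$.

• The set of points held in a node $\mathscr{N}_i$ is denoted
$\mathscr{P}(\mathscr{N}_i)$ or $\mathscr{P}_i$.

• The convex subset represented by node $\mathscr{N}_i$ is
denoted $\mathscr{S}(\mathscr{N}_i)$ or $\mathscr{S}_i$.

• The set of descendant nodes of a node $\mathscr{N}_i$,
denoted $\mathscr{D}^n(\mathscr{N}_i)$ or $\mathscr{D}^n_i$, is the set of nodes
$\mathscr{C}(\mathscr{N}_i) \cup \mathscr{C}(\mathscr{C}(\mathscr{N}_i)) \cup \ldots$.

• The set of descendant points of a node $\mathscr{N}_i$,
denoted $\mathscr{D}^p(\mathscr{N}_i)$ or $\mathscr{D}^p_i$, is the set of points
$\{ p : p \in \mathscr{P}(\mathscr{D}^n(\mathscr{N}_i)) \cup \mathscr{P}(\mathscr{N}_i) \}$.

• The parent of a node $\mathscr{N}_i$ is denoted $\mathrm{Par}(\mathscr{N}_i)$.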

An abstract representation of an example space tree on
a five-point dataset in $\mathbb{R}^2$ is shown in Figure 1(a). In
this illustration, $\mathscr{N}_r$ is the root node of the tree; it has
no parent and it holds two points, one of which is $x_1$.
The node $\mathscr{N}_v$ also holds $x_1$, together with one other
point, and has children $\mathscr{N}_c$ and $\mathscr{N}_d$ (which each have no
children and contain points $x_2$ and $x_4$, respectively).
In Figure 1(b), the points in the tree and the subsets
$\mathscr{S}_r$ (darker rectangle) and $\mathscr{S}_v$ (lighter rectangle) are
plotted. $\mathscr{S}_c = \{x_2\}$ and $\mathscr{S}_d = \{x_4\}$ are not labeled.

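Definition 2. The minimum distance between two
nodes $\mathscr{N}_i$ and $\mathscr{N}_j$ is defined as

$$d_{\min}(\mathscr{N}_i, \mathscr{N}_j) = \min \{ \|p_i - p_j\| \;\; \forall\, p_i \in \mathscr{D}^p_i,\ p_j \in \mathscr{D}^p_j \}.$$

Definition 3. The maximum distance between two
nodes $\mathscr{N}_i$ and $\mathscr{N}_j$ is defined as

$$d_{\max}(\mathscr{N}_i, \mathscr{N}_j) = \max \{ \|p_i - p_j\| \;\; \forall\, p_i \in \mathscr{D}^p_i,\ p_j \in \mathscr{D}^p_j \}.$$

Definition 4. The maximum child distance of a
node $\mathscr{N}_i$ is defined as the maximum distance between
the centroid $C_i$ of $\mathscr{S}_i$ and each point in $\mathscr{P}_i$:

$$\rho(\mathscr{N}_i) = \max_{p \in \mathscr{P}_i} \|C_i - p\|.$$

Definition 5. The maximum descendant distance
of a node $\mathscr{N}_i$ is defined as the maximum distance
between the centroid $C_i$ of $\mathscr{S}_i$ and points in $\mathscr{D}^p_i$:

$$\lambda(\mathscr{N}_i) = \max_{p \in \mathscr{D}^p_i} \|C_i - p\|.$$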



It is straightforward to show that kd-trees, octrees,
metric trees, ball trees, cover trees (Beygelzimer et al.,
2006), R-trees, and vantage-point trees all satisfy
the conditions of a space tree. The quantities
$d_{\min}(\mathscr{N}_q, \mathscr{N}_r)$, $d_{\max}(\mathscr{N}_q, \mathscr{N}_r)$, $\lambda(\mathscr{N}_q)$, and $\rho(\mathscr{N}_q)$ are
easily derived (or bounded, which in many cases is
sufficient) for each of these types of trees.

For a kd-tree, $d_{\min}(\mathscr{N}_i, \mathscr{N}_j)$ is bounded below by the
minimum distance between $\mathscr{S}_i$ and $\mathscr{S}_j$; $d_{\max}(\mathscr{N}_i, \mathscr{N}_j)$
is bounded above similarly by the maximum distance
between $\mathscr{S}_i$ and $\mathscr{S}_j$. Both $\rho(\mathscr{N}_i)$ and $\lambda(\mathscr{N}_i)$ are
bounded above by $d_{\max}(C_i, \mathscr{N}_i)$.

For the cover tree (Beygelzimer et al., 2006), each
node $\mathscr{N}_i$ contains only one point $p_i$ and has 'scale'
$s_i$. $d_{\min}(\mathscr{N}_i, \mathscr{N}_j)$ is bounded below by
$d(p_i, p_j) - 2^{s_i+1} - 2^{s_j+1}$ and $d_{\max}(\mathscr{N}_i, \mathscr{N}_j)$ is bounded above by
$d(p_i, p_j) + 2^{s_i+1} + 2^{s_j+1}$. Because $p_i$ is the centroid of
$\mathscr{S}_i$, $\rho(\mathscr{N}_i) = 0$. $\lambda(\mathscr{N}_i)$ is simply $2^{s_i+1}$.

4. Tree Traversals 

In general, the nodes of each space tree can be tra- 
versed in a number of ways. However, there have been 
no attempts to formalize tree traversal. Therefore, we 
introduce several definitions which will be useful later. 

Definition 6. A single-tree traversal is a process 
that, given a space tree, will visit each node in that 
tree once and perform a computation on any points 
contained within the node that is being visited. 

As an example, the standard depth-first and
breadth-first traversals are single-tree traversals. From
a programming perspective, the computation in the
single-tree traversal can be implemented with a simple
callback BaseCase(point) function. This allows
the computation to be entirely independent of the
single-tree traversal itself. As an example, a simple
single-tree algorithm to count the number of points
in a given tree would increment a counter variable
each time BaseCase(point) was called. However, this
concept by itself is not very useful; without pruning
branches, no computations can be avoided.
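
The point-counting example can be sketched in C++ as follows. This is a minimal illustration rather than any library's implementation: the `Node` type, the 1-D points, and the function names are all assumptions introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal space-tree node: it holds some points (here 1-D
// values) and owns its children.
struct Node {
  std::vector<double> points;
  std::vector<std::unique_ptr<Node>> children;
};

// Depth-first single-tree traversal: visits every node once and calls
// baseCase on each point held in the visited node.
void SingleTreeTraversal(const Node& node,
                         const std::function<void(double)>& baseCase) {
  for (double p : node.points) baseCase(p);
  for (const auto& child : node.children)
    SingleTreeTraversal(*child, baseCase);
}

// Counts the points held in the tree by incrementing a counter in the
// callback; the traversal itself knows nothing about counting.
std::size_t CountPoints(const Node& root) {
  std::size_t count = 0;
  SingleTreeTraversal(root, [&count](double) { ++count; });
  return count;
}

// Builds a small example tree whose nodes hold five points in total.
std::unique_ptr<Node> MakeExampleTree() {
  auto root = std::make_unique<Node>();
  root->points = {0.0, 1.0};
  auto child = std::make_unique<Node>();
  child->points = {2.0};
  auto leafA = std::make_unique<Node>();
  leafA->points = {3.0};
  auto leafB = std::make_unique<Node>();
  leafB->points = {4.0};
  child->children.push_back(std::move(leafA));
  child->children.push_back(std::move(leafB));
  root->children.push_back(std::move(child));
  return root;
}
```

Because the callback is the only problem-specific part, swapping the lambda in `CountPoints` for another function changes the computation without touching the traversal.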

Definition 7. A pruning single-tree traversal is a
process that, given a space tree, will visit nodes in the
tree and perform a computation to assign a score to
that node. If the score is above some bound, the node
is "pruned" and none of its descendants will be visited;
otherwise, a computation is performed on any points
contained within that node. If no nodes are pruned,
then the traversal will visit each node in the tree once.

Algorithm 1 DepthFirstTraversal($\mathscr{N}_q$, $\mathscr{N}_r$).
  if Score($\mathscr{N}_q$, $\mathscr{N}_r$) $= \infty$ then return
  for each $p_q \in \mathscr{P}_q$, $p_r \in \mathscr{P}_r$ do
    BaseCase($p_q$, $p_r$)
  for each $\mathscr{N}_{qc} \in \mathscr{C}_q$, $\mathscr{N}_{rc} \in \mathscr{C}_r$ do
    DepthFirstTraversal($\mathscr{N}_{qc}$, $\mathscr{N}_{rc}$)

Clearly, a pruning single-tree traversal that does not
prune any nodes is a single-tree traversal. A pruning
single-tree traversal can be implemented with two
callbacks: BaseCase(point) and Score(node). This
allows both the point-to-point computation and the
scoring to be entirely independent of the traversal.
Thus, single-tree branch-and-bound algorithms can be 
expressed in a tree-independent manner. Extensions 
to the dual-tree case are given below. 

Definition 8. A dual-tree traversal is a process
that, given two space trees $\mathscr{T}_q$ (query tree) and $\mathscr{T}_r$
(reference tree), will visit every combination of nodes
$(\mathscr{N}_q, \mathscr{N}_r)$ once, where $\mathscr{N}_q \in \mathscr{T}_q$ and $\mathscr{N}_r \in \mathscr{T}_r$. At
each visit $(\mathscr{N}_q, \mathscr{N}_r)$, a computation is performed
between each point in $\mathscr{N}_q$ and each point in $\mathscr{N}_r$.

Definition 9. A pruning dual-tree traversal is a
process which, given two space trees $\mathscr{T}_q$ (query tree)
and $\mathscr{T}_r$ (reference tree), will visit combinations of
nodes $(\mathscr{N}_q, \mathscr{N}_r)$ such that $\mathscr{N}_q \in \mathscr{T}_q$ and $\mathscr{N}_r \in \mathscr{T}_r$ no
more than once, and perform a computation to assign
a score to that combination. If the score is above some
bound, the combination is pruned and no combinations
$(\mathscr{N}_{qc}, \mathscr{N}_{rc})$ such that $\mathscr{N}_{qc} \in \mathscr{D}^n_q$ and $\mathscr{N}_{rc} \in \mathscr{D}^n_r$ will be
visited; otherwise, a computation is performed between
each point in $\mathscr{N}_q$ and each point in $\mathscr{N}_r$.

Similar to the pruning single-tree traversal, a pruning
dual-tree algorithm can use two callback functions
BaseCase($p_q$, $p_r$) and Score($\mathscr{N}_q$, $\mathscr{N}_r$). An example
implementation of a depth-first pruning dual-tree
traversal is given in Algorithm 1. The traversal is
started on the root of $\mathscr{T}_q$ and the root of $\mathscr{T}_r$.

Algorithm 1 provides only one example of a commonly- 
used pruning dual-tree traversal. Other possibilities 
not explicitly documented here include breadth-first 
traversals and the unique cover tree dual-tree traversal 
described by Beygelzimer et al. (2006), which can be 
adapted to the generic space tree case. 
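
A depth-first pruning dual-tree traversal in the spirit of Algorithm 1 can be sketched in C++ as below. The sketch makes several assumptions that are not part of the text: points are 1-D and live only in leaves (as in a kd-tree), each node covers an interval `[lo, hi]`, and when only one of the two nodes is a leaf the traversal recurses into the children of the other node alone, so that trees of unequal depth are still covered. `Node`, `MinDistance`, `CountWithinRadius`, and `MakeTree` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical 1-D space-tree node; [lo, hi] is its convex subset.
struct Node {
  double lo = 0.0, hi = 0.0;
  std::vector<double> points;  // Non-empty only in leaves.
  std::vector<std::unique_ptr<Node>> children;
  bool IsLeaf() const { return children.empty(); }
};

// Minimum distance between the intervals of two nodes.
double MinDistance(const Node& a, const Node& b) {
  if (a.hi < b.lo) return b.lo - a.hi;
  if (b.hi < a.lo) return a.lo - b.hi;
  return 0.0;
}

// Traversal parameterized by a Rules type that supplies
// Score(queryNode, referenceNode) and BaseCase(pq, pr).
template <typename Rules>
void DepthFirstTraversal(const Node& q, const Node& r, Rules& rules) {
  if (std::isinf(rules.Score(q, r))) return;  // Combination is pruned.
  if (q.IsLeaf() && r.IsLeaf()) {
    for (double pq : q.points)
      for (double pr : r.points) rules.BaseCase(pq, pr);
    return;
  }
  // Assumption: a leaf stands in for itself when only the other node
  // has children.
  std::vector<const Node*> qs, rs;
  if (q.IsLeaf()) {
    qs.push_back(&q);
  } else {
    for (const auto& c : q.children) qs.push_back(c.get());
  }
  if (r.IsLeaf()) {
    rs.push_back(&r);
  } else {
    for (const auto& c : r.children) rs.push_back(c.get());
  }
  for (const Node* qc : qs)
    for (const Node* rc : rs) DepthFirstTraversal(*qc, *rc, rules);
}

// Example rules: count point pairs no farther apart than `radius`,
// pruning node pairs whose minimum distance already exceeds it.
struct CountWithinRadius {
  double radius;
  std::size_t pairs = 0, baseCases = 0;
  explicit CountWithinRadius(double r) : radius(r) {}
  double Score(const Node& q, const Node& r) const {
    const double d = MinDistance(q, r);
    return d > radius ? std::numeric_limits<double>::infinity() : d;
  }
  void BaseCase(double pq, double pr) {
    ++baseCases;
    if (std::fabs(pq - pr) <= radius) ++pairs;
  }
};

// Builds a root with two leaves from at least two sorted points.
std::unique_ptr<Node> MakeTree(const std::vector<double>& sorted) {
  auto root = std::make_unique<Node>();
  root->lo = sorted.front();
  root->hi = sorted.back();
  const std::size_t mid = sorted.size() / 2;
  auto left = std::make_unique<Node>();
  left->points.assign(sorted.begin(), sorted.begin() + mid);
  auto right = std::make_unique<Node>();
  right->points.assign(sorted.begin() + mid, sorted.end());
  for (Node* leaf : {left.get(), right.get()}) {
    leaf->lo = leaf->points.front();
    leaf->hi = leaf->points.back();
  }
  root->children.push_back(std::move(left));
  root->children.push_back(std::move(right));
  return root;
}
```

Only `CountWithinRadius` is specific to the problem being solved; the traversal is reused unchanged for any other pair of `Score` and `BaseCase` functions with the same signatures.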

The rest of this work is devoted to using these concepts
to represent existing dual-tree algorithms in the four
parts described in Section 2.

5. k-Nearest Neighbor Search

k-nearest neighbor search is a well-studied problem
with a plethora of algorithms and results. The problem
can be stated as follows:

Given a query dataset $S_q \in \mathbb{R}^{n \times d}$, a reference dataset
$S_r \in \mathbb{R}^{m \times d}$, and an integer $k : 0 < k < m$, for each
point $p_q \in S_q$, find the $k$ nearest neighbors in $S_r$ and
their distances from $p_q$. The list of nearest neighbors
for a point $p_q$ can be referred to as $N_{p_q}$ and the
distances to nearest neighbors for $p_q$ can be referred to
as $D_{p_q}$. Thus, the $k$-th nearest neighbor to point $p_q$ is
$N_{p_q}[k]$ and $D_{p_q}[k] = \|p_q - N_{p_q}[k]\|$.

This can be solved using a brute-force approach:
compare every possible point combination and store the
$k$ smallest distance results for each $p_q$. However, this
scales poorly, as $O(nm)$; hence the importance of fast
algorithms to solve the problem. Many existing
algorithms employ tree-based branch-and-bound
strategies (Beygelzimer et al., 2006; Cover & Hart, 1967;
Friedman et al., 1977; Fukunaga & Narendra, 1975;
Gray & Moore, 2001; Ram et al., 2009).

We unify all of these branch-and-bound strategies by
defining methods BaseCase($p_q$, $p_r$) and Score($\mathscr{N}_q$,
$\mathscr{N}_r$) for use with a pruning dual-tree traversal.

At the initialization of the tree traversal, the lists $N_{p_q}$
and $D_{p_q}$ are empty lists for each query point $p_q$. After
the traversal is complete, for a query point $p_q$, the
set $\{N_{p_q}[1], \ldots, N_{p_q}[k]\}$ is the ordered set of $k$ nearest
neighbors of the point $p_q$, and each $D_{p_q}[i] = \|p_q - N_{p_q}[i]\|$.
If we assume that $D_{p_q}[i] = \infty$ if $i$ is greater
than the length of $D_{p_q}$, we can formulate BaseCase()
as given in Algorithm 2.



Algorithm 2 k-nearest-neighbors BaseCase()
Input: query point $p_q$, reference point $p_r$, list of $k$
nearest candidate points $N_{p_q}$ and $k$ candidate
distances $D_{p_q}$ (both ordered by ascending distance)
Output: distance $d$ between $p_q$ and $p_r$
  $d \leftarrow \|p_q - p_r\|$
  if $d < D_{p_q}[k]$ and BaseCase($p_q$, $p_r$) not yet called then
    insert $d$ into ordered list $D_{p_q}$ and truncate list to
    length $k$
    insert $p_r$ into $N_{p_q}$ such that $N_{p_q}$ is ordered by
    distance and truncate list to length $k$
  return $d$

1 In practice, k-nearest-neighbors is often run with
identical reference and query sets. In that situation it may be
useful to modify this implementation of BaseCase() so that
a point does not return itself as the nearest neighbor (with
distance 0).

With the base case established, only the pruning rule
remains. A valid pruning rule will, for a given query
node $\mathscr{N}_q$ and reference node $\mathscr{N}_r$, prune the reference
subtree rooted at $\mathscr{N}_r$ if and only if it is known that
there are no points in $\mathscr{D}^p_r$ that are in the set of $k$ nearest
neighbors of any points in $\mathscr{D}^p_q$. Thus, at any point in
the traversal, we can prune the combination $(\mathscr{N}_q, \mathscr{N}_r)$
if and only if $d_{\min}(\mathscr{N}_q, \mathscr{N}_r) \ge B_1(\mathscr{N}_q)$, where

$$B_1(\mathscr{N}_q) = \max_{p \in \mathscr{D}^p_q} D_p[k].$$

Now, we can describe this bound recursively. This
is important for implementation; a recursive function
can cache previous calculations for large speedups.

$$B_1(\mathscr{N}_q) = \max \Big\{ \max_{p \in \mathscr{P}_q} D_p[k],\ \max_{\mathscr{N}_c \in \mathscr{C}_q,\ p \in \mathscr{D}^p_c} D_p[k] \Big\}$$

$$= \max \Big\{ \max_{p \in \mathscr{P}_q} D_p[k],\ \max_{\mathscr{N}_c \in \mathscr{C}_q} \Big\{ \max_{p \in \mathscr{D}^p_c} D_p[k] \Big\} \Big\}$$

$$= \max \Big\{ \max_{p \in \mathscr{P}_q} D_p[k],\ \max_{\mathscr{N}_c \in \mathscr{C}_q} B_1(\mathscr{N}_c) \Big\}$$
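
The base case of Algorithm 2 and the recursive form of $B_1$ can be sketched in C++ as follows. This is an illustration under stated assumptions, not the MLPACK implementation: points are 1-D, neighbors are stored as indices into the reference set, the "not yet called" check is left to the caller, and `Candidates`, `KnnBaseCase`, `QueryNode`, and `B1` are hypothetical names. No caching of child bounds is attempted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

const double kInf = std::numeric_limits<double>::infinity();

// Candidate lists for one query point, ordered by ascending distance.
struct Candidates {
  std::vector<double> distances;       // D_p
  std::vector<std::size_t> neighbors;  // N_p (reference point indices)
  // D_p[k] (1-based), with missing entries treated as infinite.
  double Kth(std::size_t k) const {
    return distances.size() >= k ? distances[k - 1] : kInf;
  }
};

// Base case for 1-D points: insert the reference point into the ordered
// candidate lists if it beats the current k-th candidate, then truncate.
double KnnBaseCase(double pq, double pr, std::size_t rIndex, std::size_t k,
                   Candidates& c) {
  const double d = std::fabs(pq - pr);
  if (d < c.Kth(k)) {
    const auto it =
        std::upper_bound(c.distances.begin(), c.distances.end(), d);
    const auto pos = it - c.distances.begin();
    c.distances.insert(it, d);
    c.neighbors.insert(c.neighbors.begin() + pos, rIndex);
    if (c.distances.size() > k) {
      c.distances.resize(k);
      c.neighbors.resize(k);
    }
  }
  return d;
}

// Hypothetical query node: indices of the query points it holds, plus
// its children.
struct QueryNode {
  std::vector<std::size_t> points;
  std::vector<QueryNode> children;
};

// Recursive B_1: the maximum of D_p[k] over held points and of the
// bounds of all children. A node with no points and no children
// yields 0 (the maximum over an empty set of non-negative values).
double B1(const QueryNode& node, const std::vector<Candidates>& cands,
          std::size_t k) {
  double bound = 0.0;
  for (std::size_t p : node.points)
    bound = std::max(bound, cands[p].Kth(k));
  for (const QueryNode& child : node.children)
    bound = std::max(bound, B1(child, cands, k));
  return bound;
}
```

Until every descendant query point has $k$ candidates, `B1` stays infinite and nothing can be pruned by it, which is the motivation for the alternate bound discussed next.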



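Suppose we have, at some point in the traversal, two
points $p_0, p_1 \in \mathscr{D}^p_q$ for some node $\mathscr{N}_q$, with $D_{p_0}[k] = \infty$
and $D_{p_1}[k] < \infty$. This means there exist $k$ points
$\{p^1_r, \ldots, p^k_r\}$ in $S_r$ such that $d(p_1, p^i_r) \le D_{p_1}[k]$ for
$i = \{1, \ldots, k\}$. Because $p_0, p_1 \in \mathscr{D}^p_q$, we can apply the
triangle inequality to see that $d(p_0, p_1) \le 2\lambda(\mathscr{N}_q)$.
Therefore, $d(p_0, p^i_r) \le D_{p_1}[k] + 2\lambda(\mathscr{N}_q)$ for $i = \{1, \ldots, k\}$.
Using this observation we can construct an alternate
bound function $B_2(\mathscr{N}_q)$:

$$B_2(\mathscr{N}_q) = \min_{p \in \mathscr{D}^p_q} D_p[k] + 2\lambda(\mathscr{N}_q)$$

which can, like $B_1(\mathscr{N}_q)$, be rearranged to provide a
recursive definition. In addition, if $p_0 \in \mathscr{P}_q$ and
$p_1 \in \mathscr{D}^p_q$, we can bound $d(p_0, p_1)$ more tightly with
$\rho(\mathscr{N}_q) + \lambda(\mathscr{N}_q)$ instead of $2\lambda(\mathscr{N}_q)$. These observations yield

$$B_2(\mathscr{N}_q) = \min \Big\{ \min_{p \in \mathscr{P}_q} \big( D_p[k] + \rho(\mathscr{N}_q) + \lambda(\mathscr{N}_q) \big),\ \min_{\mathscr{N}_c \in \mathscr{C}_q} \big( B_2(\mathscr{N}_c) + 2(\lambda(\mathscr{N}_q) - \lambda(\mathscr{N}_c)) \big) \Big\}.$$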

Both $B_1(\mathscr{N}_q)$ and $B_2(\mathscr{N}_q)$ provide valid pruning rules.
We can combine both to get a tighter pruning rule
by taking the tighter of the two bounds. In addition,
$B_1(\mathscr{N}_q) \ge B_1(\mathscr{N}_c)$ and $B_2(\mathscr{N}_q) \ge B_2(\mathscr{N}_c)$ for
all $\mathscr{N}_c \in \mathscr{C}_q$. Therefore, we can prune $(\mathscr{N}_q, \mathscr{N}_r)$ if
$d_{\min}(\mathscr{N}_q, \mathscr{N}_r) \ge \min\{B_1(\mathrm{Par}(\mathscr{N}_q)), B_2(\mathrm{Par}(\mathscr{N}_q))\}$.

These observations are combined for a better bound:

$$B(\mathscr{N}_q) = \min \Big\{ \max \Big\{ \max_{p \in \mathscr{P}_q} D_p[k],\ \max_{\mathscr{N}_c \in \mathscr{C}_q} B(\mathscr{N}_c) \Big\},\ \min_{p \in \mathscr{P}_q} \big( D_p[k] + \rho(\mathscr{N}_q) + \lambda(\mathscr{N}_q) \big),\ \min_{\mathscr{N}_c \in \mathscr{C}_q} \big( B(\mathscr{N}_c) + 2(\lambda(\mathscr{N}_q) - \lambda(\mathscr{N}_c)) \big),\ B(\mathrm{Par}(\mathscr{N}_q)) \Big\}$$

As a result of this bound function being expressed
recursively, previous bounds can be cached and used to
calculate the bound $B(\mathscr{N}_q)$ quickly. We can use this
to structure Score() as given in Algorithm 3.

Algorithm 3 k-nearest-neighbors Score()
Input: query node $\mathscr{N}_q$, reference node $\mathscr{N}_r$
Output: a score for the node combination
$(\mathscr{N}_q, \mathscr{N}_r)$, or $\infty$ if the combination should be pruned
  if $d_{\min}(\mathscr{N}_q, \mathscr{N}_r) < B(\mathscr{N}_q)$ then
    return $d_{\min}(\mathscr{N}_q, \mathscr{N}_r)$
  return $\infty$

Applying the meta-algorithm in Section 2 with
any tree type and any pruning dual-tree traversal
gives a correct implementation of k-nearest neighbor
search. Proving the correctness is straightforward;
first, a (non-pruning) dual-tree traversal which uses
BaseCase() as given in Algorithm 2 will give correct
results for any space tree. Then, we already know that
$B(\mathscr{N}_q)$ is a bounding function that, at any point in
the traversal, will not prune any subtrees which could
contain better nearest neighbor candidates than the
current candidates. Thus, the true nearest neighbors
for each query point will always be visited, and the
results will be correct.

We now show that this algorithm is a generalization
of the standard kd-tree k-NN search, which uses a
pruning dual-tree depth-first traversal. The archetypal
algorithm for all-nearest neighbor search (k-nearest
neighbor search with $k = 1$) given for kd-trees in Alex
Gray's Ph.D. thesis (2003) is shown here in Algorithm
4 with converted notation. $\delta^{nn}_q$ is the bound for a node
$\mathscr{N}_q$ and is initialized to $\infty$; $D_{p_q}$ represents the nearest
distance for a query point $p_q$, and $N_{p_q}$ represents the
nearest neighbor for a query point $p_q$. $\mathscr{N}_q$.left
represents the left child of $\mathscr{N}_q$ and is defined to be $\mathscr{N}_q$ if $\mathscr{N}_q$
has no children; $\mathscr{N}_q$.right is similarly defined.

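The structure of the algorithm matches Algorithm 1;
it is a dual-tree depth-first recursion. Because this is
a depth-first recursion, $\delta^{nn}_q = \infty$ for a node $\mathscr{N}_q$ if no
descendants of $\mathscr{N}_q$ have been recursed into. Otherwise,
$\delta^{nn}_q$ is the maximum of $D_{p_q}$ for all $p_q \in \mathscr{D}^p_q$. That
is, $\delta^{nn}_q = B_1(\mathscr{N}_q)$. Thus, the comparison in the first
line of Algorithm 4 is equivalent to Algorithm 3 with
$B_1(\mathscr{N}_q)$ instead of $B(\mathscr{N}_q)$.

Algorithm 4 AllNN($\mathscr{N}_q$, $\mathscr{N}_r$) (Gray, 2003)
  if $d_{\min}(\mathscr{N}_q, \mathscr{N}_r) \ge \delta^{nn}_q$, then return
  if $\mathscr{N}_q$ is leaf and $\mathscr{N}_r$ is leaf then
    for each $p_q \in \mathscr{P}_q$, $p_r \in \mathscr{P}_r$ do
      $d_{qr} \leftarrow \|p_q - p_r\|$
      if $d_{qr} < D_{p_q}$ then $D_{p_q} \leftarrow d_{qr}$; $N_{p_q} \leftarrow p_r$
      if $d_{qr} < \delta^{nn}_q$ then $\delta^{nn}_q \leftarrow d_{qr}$
  AllNN($\mathscr{N}_q$.left, closer-of($\mathscr{N}_r$.left, $\mathscr{N}_r$.right))
  AllNN($\mathscr{N}_q$.left, farther-of($\mathscr{N}_r$.left, $\mathscr{N}_r$.right))
  AllNN($\mathscr{N}_q$.right, closer-of($\mathscr{N}_r$.left, $\mathscr{N}_r$.right))
  AllNN($\mathscr{N}_q$.right, farther-of($\mathscr{N}_r$.left, $\mathscr{N}_r$.right))
  $\delta^{nn}_q = \min(\delta^{nn}_q, \max(\delta^{nn}_{q.\mathrm{left}}, \delta^{nn}_{q.\mathrm{right}}))$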

The first two lines of the inside of the for each loop
(the base case) are equivalent to Algorithm 2 with $k = 1$.
kd-trees only hold points in leaves; therefore, the
base case is called for all combinations of points in
each node combination, identically to the depth-first
traverser (Algorithm 1).

A kd-tree is a space tree and the dual depth-first
recursion is a pruning dual-tree traversal. Also, we
showed the equivalency of the pruning rule (that is,
the Score() function) and the equivalency of the base
case. So, it is clear that Algorithm 4 is produced using
our meta-algorithm with these parameters. In addition,
because $B(\mathscr{N}_q)$ is never greater than $B_1(\mathscr{N}_q)$,
Algorithm 3 provides a bound at least as tight as the
pruning rule in Algorithm 4.

This algorithm is also a generalization of the standard
cover tree k-NN search (Beygelzimer et al., 2006).
The cover tree search is a pruning dual-tree traversal
where the query tree is traversed depth-first while
the reference tree is simultaneously traversed
breadth-first. The pruning rule (after simple adaptation to
the k-nearest-neighbor search problem instead of the
nearest-neighbor search problem) is equivalent to

$$d_{\min}(\mathscr{N}_q, \mathscr{N}_r) \ge D_{p_q}[k] + \lambda(\mathscr{N}_q)$$

where $p_q$ is the point contained in $\mathscr{N}_q$ (remember, each
node of a cover tree contains one point). This is
equivalent to $B_2(\mathscr{N}_q)$ because $\rho(\mathscr{N}_q) = 0$ for cover trees. The
transformation from the algorithm given by Beygelzimer
et al. (2006) to our representation is made clear
in Appendix A (supplementary material) and in the
k-nearest neighbor search implementation of the C++
library MLPACK (Curtin et al., 2011); this is
implemented in terms of our meta-algorithm.

Specific algorithms for ball trees, metric trees, VP
trees, octrees, and other space trees are trivial to
create using the BaseCase() and Score() implementation
given here (and in MLPACK). Note also that this
implementation will work in any metric space.

An extension to k-furthest neighbor search is
straightforward. The bound function must be 'inverted' by
changing 'max' to 'min' (and vice versa); in addition,
the distances $D_{p_q}[i]$ must be initialized to 0 instead of
$\infty$, and the lists $D$ and $N$ must be sorted by descending
distance instead of ascending distance. Lastly, the
comparison $d < D_{p_q}[k]$ must be changed to $d > D_{p_q}[k]$.
With these simple changes, we have solved an entirely
different problem using our meta-algorithm with very
little effort. A k-furthest neighbor search using our
meta-algorithm for both kd-trees and cover trees is
also available in MLPACK.

6. Range Search 

Range search is another popular neighbor searching
problem related to k-nearest neighbor search. In
addition to being a fairly standard machine learning task,
it has numerous uses in applications such as databases
and geographic information systems (GIS). A treatise
on the history of the problem and solutions is given by
Agarwal & Erickson (1999). The problem is:

Given query and reference datasets $S_q$, $S_r$ and a range
$[\delta_1, \delta_2]$, for each point $p_q \in S_q$, find all points in $S_r$
such that $\delta_1 \le \|p_q - p_r\| \le \delta_2$. As with k-nearest
neighbor search, refer to the list of neighbors for each
query point $p_q$ as $N_{p_q}$ and the corresponding distances
as $D_{p_q}$. These lists are not sorted in any particular
order, and at initialization time, they are empty.

In different settings, the problem of range search may
not be stated identically; however, our results are
easily adaptable. A BaseCase() implementation is given
in Algorithm 5, and a Score() implementation is given
in Algorithm 6. The only bounds to consider are
$[\delta_1, \delta_2]$, so no complex bound handling is necessary.

While range search is sometimes mentioned in the
context of dual-tree algorithms (Gray & Moore, 2001),
the focus is usually on k-nearest neighbor search.
As a result, we cannot find any explicitly published
dual-tree algorithms to generalize; however, a
single-tree algorithm was proposed by Bentley and
Friedman (1979). Thus, the BaseCase() and Score()
proposed here can be used with our meta-algorithm to
produce entirely novel range search implementations;
MLPACK has kd-tree and cover tree implementations.



Algorithm 5 Range search BaseCase().
Input: query point $p_q$, reference point $p_r$, neighbor
list $N_{p_q}$, distance list $D_{p_q}$
Output: distance $d$ between $p_q$ and $p_r$
  $d \leftarrow \|p_q - p_r\|$
  if $\delta_1 \le d \le \delta_2$ and BaseCase($p_q$, $p_r$) not yet called then
    $N_{p_q} \leftarrow N_{p_q} \cup \{p_r\}$
    $D_{p_q} \leftarrow D_{p_q} \cup \{d\}$
  return $d$

Algorithm 6 Range search Score().
Input: query node $\mathscr{N}_q$, reference node $\mathscr{N}_r$
Output: a score for $(\mathscr{N}_q, \mathscr{N}_r)$, or $\infty$ if the
combination should be pruned
  if $\delta_1 \le d_{\min}(\mathscr{N}_q, \mathscr{N}_r) \le \delta_2$ then
    return $d_{\min}(\mathscr{N}_q, \mathscr{N}_r)$
  return $\infty$
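
A C++ sketch of range search rules for 1-D points is given below. All names (`Interval`, `MinDistance`, `MaxDistance`, `RangeSearchRules`) are hypothetical. One deliberate difference should be noted: the `Score` function here is an assumption of this sketch and not a transcription of Algorithm 6. It prunes only when the interval of possible distances $[d_{\min}, d_{\max}]$ cannot overlap $[\delta_1, \delta_2]$, which is the conservative choice when two nodes overlap (so that $d_{\min} = 0$) but still contain pairs of points inside the range.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical 1-D node bound: the interval [lo, hi] containing all
// descendant points of a node.
struct Interval {
  double lo, hi;
};

// Smallest possible distance between a point of a and a point of b.
double MinDistance(const Interval& a, const Interval& b) {
  if (a.hi < b.lo) return b.lo - a.hi;
  if (b.hi < a.lo) return a.lo - b.hi;
  return 0.0;
}

// Largest possible distance between a point of a and a point of b.
double MaxDistance(const Interval& a, const Interval& b) {
  return std::max(std::fabs(a.hi - b.lo), std::fabs(b.hi - a.lo));
}

struct RangeSearchRules {
  double lower, upper;  // The range [delta_1, delta_2].
  RangeSearchRules(double l, double u) : lower(l), upper(u) {}

  // Base case: record the reference point if its distance is in range.
  // The caller is assumed to invoke this at most once per point pair.
  double BaseCase(double pq, double pr, std::vector<double>& neighbors,
                  std::vector<double>& distances) const {
    const double d = std::fabs(pq - pr);
    if (lower <= d && d <= upper) {
      neighbors.push_back(pr);
      distances.push_back(d);
    }
    return d;
  }

  // Score: prune only when no pair of descendant points can have a
  // distance inside [lower, upper] (interval-overlap test).
  double Score(const Interval& q, const Interval& r) const {
    const double dMin = MinDistance(q, r);
    if (dMin > upper || MaxDistance(q, r) < lower)
      return std::numeric_limits<double>::infinity();
    return dMin;
  }
};
```

These two member functions have the shape expected by a pruning dual-tree traversal, so the same rules object can be reused with any tree that can report (or bound) minimum and maximum node distances.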



7. Boruvka's Algorithm 

Finding a Euclidean minimum spanning tree has
been a relevant problem since Boruvka's algorithm
was proposed in 1926. Recently, a dual-tree version
of Boruvka's algorithm was developed (March et al.,
2010) for kd-trees and cover trees. We unify these two
algorithms and generalize to other types of space tree
by formulating BaseCase() and Score() functions.

For a dataset $S_r \in \mathbb{R}^{N \times D}$, Boruvka's algorithm
connects each point to its nearest neighbor, giving many
'components'. For each component $c$, the nearest
point in $S_r$ to any point of $c$ that is not part of
$c$ is found. The points are connected, combining
those components. This process repeats until only one
component (the minimum spanning tree) remains.

During the algorithm, we maintain a list $F$ made up
of components $F_i : \{E_i, V_i\}$ where $E_i$ is the list of
edges and $V_i$ is the list of vertices in the component
$F_i$ (these are points in $S_r$). Each point in $S_r$ belongs
to only one $F_i$. At initialization, $|F| = |S_r|$ and
$F_i = \{\emptyset, \{p_i\}\}$ for $i = \{1, \ldots, |S_r|\}$, where $p_i$ is the $i$'th point
in $S_r$. For $p \in S_r$ we define $F(p) = F_i$ if $F_i$ is the
component containing $p$. During the algorithm, we
maintain $N(F_i)$ as the candidate nearest neighbor of
component $F_i$ and $p_c(F_i)$ as the point in component
$F_i$ nearest to $N(F_i)$. Then, $D(F_i) = \|p_c(F_i) - N(F_i)\|$.
Remember that $F(N(F_i)) \ne F_i$.

To run Boruvka's algorithm with a space tree $\mathscr{T}_r$ built
on the set of points $S_r$, a pruning dual-tree traversal is
run with BaseCase() as Algorithm 7, Score() as
Algorithm 8, $\mathscr{T}_r$ as both of the trees, and $F$ as initialized
before. Note that Score() uses $B(\mathscr{N}_q)$ from Section 5
with $k = 1$. Upon traversal completion, we have a list
$N(F_i)$ of nearest neighbors of each component $F_i$. The
edge $(N(F_i), p_c(F_i))$ is added to $F_i$ for each $F_i$. Then,
any components in $F$ with shared edges are merged
into a new list $F'$ where $|F'| < |F|$. The pruning
dual-tree traversal is then run again with $F = F'$ and the
traversal-merge process repeats until $|F| = 1$. When
$|F| = 1$, then $F_1$ is the minimum spanning tree of $S_r$.

Algorithm 7 Boruvka's algorithm BaseCase().
Input: query point $p_q$, reference point $p_r$, nearest
candidate point $N(F(p_q))$ and distance $D(F(p_q))$
Output: distance $d$ between $p_q$ and $p_r$
  if $p_q = p_r$ then
    return 0
  if $F(p_q) \ne F(p_r)$ and $\|p_q - p_r\| < D(F(p_q))$ then
    $D(F(p_q)) \leftarrow \|p_q - p_r\|$
    $N(F(p_q)) \leftarrow p_r$; $p_c(F(p_q)) \leftarrow p_q$
  return $\|p_q - p_r\|$

Algorithm 8 Boruvka's algorithm Score().
Input: query node $\mathscr{N}_q$, reference node $\mathscr{N}_r$
Output: a score for the node combination
$(\mathscr{N}_q, \mathscr{N}_r)$, or $\infty$ if the combination should be pruned
  if $d_{\min}(\mathscr{N}_q, \mathscr{N}_r) < B(\mathscr{N}_q)$ then
    if $F(p_q) = F(p_r) \;\forall\, p_q \in \mathscr{D}^p_q, p_r \in \mathscr{D}^p_r$ then
      return $\infty$
    return $d_{\min}(\mathscr{N}_q, \mathscr{N}_r)$
  return $\infty$
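
The bookkeeping performed by the base case in one round can be sketched in C++ as follows. This is an illustration under assumptions: points are 1-D and referred to by index, components are identified by an integer label, and the brute-force loop in `RunRound` merely stands in for a pruning dual-tree traversal. `BoruvkaState`, `BoruvkaBaseCase`, and `RunRound` are hypothetical names; merging components after a round is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// State for one round of Boruvka's algorithm on 1-D points.
// component[i] plays the role of F(p_i); the per-component arrays play
// the roles of D(F_i), N(F_i), and p_c(F_i).
struct BoruvkaState {
  std::vector<double> points;
  std::vector<std::size_t> component;
  std::vector<double> candidateDistance;       // D(F_i)
  std::vector<std::size_t> candidateNeighbor;  // N(F_i), a point index
  std::vector<std::size_t> candidateSource;    // p_c(F_i), a point index

  explicit BoruvkaState(const std::vector<double>& pts)
      : points(pts),
        component(pts.size()),
        candidateDistance(pts.size(),
                          std::numeric_limits<double>::infinity()),
        candidateNeighbor(pts.size(), 0),
        candidateSource(pts.size(), 0) {
    // Initially every point is its own component.
    for (std::size_t i = 0; i < pts.size(); ++i) component[i] = i;
  }
};

// Base case: update the candidate edge of the query point's component
// if the reference point is in another component and is closer.
double BoruvkaBaseCase(std::size_t q, std::size_t r, BoruvkaState& s) {
  if (q == r) return 0.0;
  const double d = std::fabs(s.points[q] - s.points[r]);
  const std::size_t fq = s.component[q];
  if (fq != s.component[r] && d < s.candidateDistance[fq]) {
    s.candidateDistance[fq] = d;
    s.candidateNeighbor[fq] = r;
    s.candidateSource[fq] = q;
  }
  return d;
}

// Brute-force stand-in for the traversal: every pair, no pruning.
void RunRound(BoruvkaState& s) {
  for (std::size_t q = 0; q < s.points.size(); ++q)
    for (std::size_t r = 0; r < s.points.size(); ++r)
      BoruvkaBaseCase(q, r, s);
}
```

After a round, each component label that is still in use indexes the shortest edge leaving that component, which is the edge that would be added before merging.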

For a proof of the correctness of the meta-algorithm, see
Theorem 4.1 in March et al. (2010). That proof can
be adapted from kd-trees to general space trees. Our
representation is a generalization of their algorithms;
our meta-algorithm produces their kd-tree and cover
tree implementations with a tighter distance bound
$B(\mathscr{N}_q)$. Our meta-algorithm produces a provably
correct dual-tree algorithm with any type of space tree.

8. Kernel Density Estimation 

Much work has been produced regarding the use
of dual-tree algorithms for kernel density estimation
(KDE), including by Gray & Moore (2001; 2003b) and
later by Lee et al. (2006; 2008). KDE is an important
machine learning problem with a vast range of
applications, from signal processing to econometrics. The
problem is, given query and reference sets $S_q$, $S_r$, to
estimate a probability density $f_q$ at each point $p_q \in S_q$
using each point $p_r \in S_r$ and a kernel function $K_h$.
The exact probability density at a point $p_q$ is the sum
of $K_h(\|p_q - p_r\|)$ for all $p_r \in S_r$.



In general, the kernel function is some zero-centered
probability density function, such as a Gaussian. This
means that when $\|p_q - p_r\|$ is very large, the
contribution of $K_h$ to $f_q$ is very small. Therefore, we can
approximate small values using a dual-tree algorithm
to avoid unnecessary computation; this is the idea set
forth by Gray & Moore (2001). Because $K_h$ is a
function which is decreasing with distance, the maximum
difference between $K_h$ values for a given combination
$(\mathscr{N}_q, \mathscr{N}_r)$ can be bounded above with

$$B_K(\mathscr{N}_q, \mathscr{N}_r) = K_h(d_{\min}(\mathscr{N}_q, \mathscr{N}_r)) - K_h(d_{\max}(\mathscr{N}_q, \mathscr{N}_r)).$$

The algorithm takes a parameter $\epsilon$; when $B_K(\mathscr{N}_q, \mathscr{N}_r)$
is less than $\epsilon / |S_r|$, the kernel values are approximated
using the kernel value of the centroid $C_r$ of the
reference node. The division by $|S_r|$ ensures that the total
approximation error is bounded above by $\epsilon$. The base
case on $p_q$ and $p_r$ merely needs to add $K_h(\|p_q - p_r\|)$ to
the existing density estimate $f_q$. When the algorithm
is initialized, $f_q = 0$ for all query points. BaseCase()
is Algorithm 10 and Score() is Algorithm 9.

Again we emphasize the flexibility of our meta- 
algorithm. To our knowledge cover trees, octrees, and 
ball trees have never been used to perform KDE in 
this manner. Our meta-algorithm can produce these 
implementations with ease. 
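
The approximation test described in this section can be sketched in C++ as follows. The unnormalized Gaussian kernel, the 1-D points, and the names `Kernel`, `KernelBound`, `CanApproximate`, and `ApproximateContribution` are all assumptions made for illustration; any kernel that decreases with distance could be substituted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Unnormalized Gaussian kernel with bandwidth h (an assumed choice).
double Kernel(double distance, double h) {
  const double z = distance / h;
  return std::exp(-0.5 * z * z);
}

// B_K: the largest possible difference between kernel values for any
// pair of points in a node combination with the given distance bounds.
double KernelBound(double dMin, double dMax, double h) {
  return Kernel(dMin, h) - Kernel(dMax, h);
}

// The node combination may be approximated (and then pruned) when the
// bound is below epsilon divided by the size of the reference set.
bool CanApproximate(double dMin, double dMax, double h, double epsilon,
                    std::size_t referenceSetSize) {
  return KernelBound(dMin, dMax, h) < epsilon / referenceSetSize;
}

// Approximate contribution of a reference node with `count` descendant
// points and centroid `centroid` to the density at query point pq.
double ApproximateContribution(double pq, double centroid,
                               std::size_t count, double h) {
  return count * Kernel(std::fabs(pq - centroid), h);
}
```

Distant node pairs have nearly equal kernel values at $d_{\min}$ and $d_{\max}$, so they pass the test and are handled with a single kernel evaluation per query point; nearby pairs fail it and fall through to the exact base case.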

9. Discussions 

We have now shown five separate algorithms for which
we have taken existing dual-tree algorithms and
constructed a BaseCase() and Score() function that can
be used with any space tree and any dual-tree
traversal. Single-tree extensions of these methods are
straightforward simplifications.

Algorithm 9 KDE Score($\mathscr{N}_q$, $\mathscr{N}_r$).
Input: query node $\mathscr{N}_q$, reference node $\mathscr{N}_r$
Output: a score for $(\mathscr{N}_q, \mathscr{N}_r)$ or $\infty$ if the
combination should be pruned
  if $B_K(\mathscr{N}_q, \mathscr{N}_r) < \epsilon / |S_r|$ then
    for each $p_q \in \mathscr{D}^p_q$ do
      $f_q \leftarrow f_q + |\mathscr{D}^p_r| \, K_h(\|p_q - C_r\|)$
    return $\infty$
  return $d_{\min}(\mathscr{N}_q, \mathscr{N}_r)$

Algorithm 10 KDE BaseCase($p_q$, $p_r$).
Input: query point $p_q$, reference point $p_r$, density
estimate $f_q$
Output: distance between $p_q$ and $p_r$
  if BaseCase($p_q$, $p_r$) already called then return
  $f_q \leftarrow f_q + K_h(\|p_q - p_r\|)$
  return $\|p_q - p_r\|$

This modular way of viewing tree-based algorithms
has several useful immediate applications. The first
is implementation. Given a tree implementation and
a dual-tree traversal implementation, all that is
required is BaseCase() and Score() functions. Thus,
code reuse can be maximized, and new algorithms can
be implemented simply by writing two new functions.
More importantly, the code is now modular. MLPACK 
(Curtin et al., 2011), written in C++, uses templates 
for this. One example is the DualTreeBoruvka class, 
which implements the meta-algorithm discussed in 
Section 7, and has the following arguments: 

    template<typename MetricType,
             typename TreeType,
             typename TraversalType>
    class DualTreeBoruvka;

This means that any class satisfying the constraints 
of the TreeType template parameter can be de- 
signed without any consideration or knowledge of 
the DualTreeBoruvka class or of the TraversalType 
class; it is entirely independent. Then, assuming a
TreeType and TraversalType without bugs, the
dual-tree Boruvka's algorithm is guaranteed to work. An
immediate example of the advantage of this is that
cover trees were implemented for MLPACK for use
with k-nearest neighbor search. This cover tree
implementation could, without any additional work, be used
with DualTreeBoruvka, which was never an intended
goal during the cover tree implementation but is still a
particularly valuable result!
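
The separation described above can be illustrated with a small, self-contained sketch. It is not MLPACK code: `RangeTree`, `NearestNeighborRules`, and `DualDepthFirstTraversal` are hypothetical names, the tree is a toy structure over sorted one-dimensional data, and the pruning bound in `Score()` is a simple maximum over the current best distances rather than the tighter bound of Section 5. The point is only that the traversal is written against a tree interface and a rule interface, and knows nothing about nearest neighbor search.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <memory>
#include <vector>

// Hypothetical space tree over sorted 1-D data. A node covers the index range
// [begin, end) of the data; only leaves are treated as holding points.
struct RangeTree {
  std::size_t begin, end;
  double lo, hi;  // bounding interval of the descendant points
  std::vector<std::unique_ptr<RangeTree>> children;

  RangeTree(const std::vector<double>& sorted, std::size_t b, std::size_t e)
      : begin(b), end(e), lo(0.0), hi(0.0) {
    assert(b < e && e <= sorted.size());
    lo = sorted[b];
    hi = sorted[e - 1];
    if (e - b > 2) {  // split until at most two points remain
      const std::size_t mid = b + (e - b) / 2;
      children.push_back(std::unique_ptr<RangeTree>(new RangeTree(sorted, b, mid)));
      children.push_back(std::unique_ptr<RangeTree>(new RangeTree(sorted, mid, e)));
    }
  }
};

// Rules for 1-nearest-neighbor search: a BaseCase() and a Score().
struct NearestNeighborRules {
  const std::vector<double>& query;
  const std::vector<double>& reference;
  std::vector<double> distance;       // best distance so far, per query point
  std::vector<std::size_t> neighbor;  // matching reference index

  NearestNeighborRules(const std::vector<double>& q, const std::vector<double>& r)
      : query(q), reference(r),
        distance(q.size(), std::numeric_limits<double>::infinity()),
        neighbor(q.size(), 0) {}

  double BaseCase(std::size_t q, std::size_t r) {
    const double d = std::fabs(query[q] - reference[r]);
    if (d < distance[q]) { distance[q] = d; neighbor[q] = r; }
    return d;
  }

  // Prune when no reference descendant can improve any query descendant.
  double Score(const RangeTree& nq, const RangeTree& nr) const {
    const double dMin = std::max({0.0, nr.lo - nq.hi, nq.lo - nr.hi});
    const double bound = *std::max_element(distance.begin() + nq.begin,
                                           distance.begin() + nq.end);
    return dMin > bound ? std::numeric_limits<double>::infinity() : dMin;
  }
};

// A pruning dual-tree traversal that depends only on the two interfaces.
template <typename TreeType, typename RuleType>
void DualDepthFirstTraversal(const TreeType& nq, const TreeType& nr,
                             RuleType& rules) {
  if (std::isinf(rules.Score(nq, nr))) return;  // combination pruned
  const bool qLeaf = nq.children.empty(), rLeaf = nr.children.empty();
  if (qLeaf && rLeaf) {
    for (std::size_t q = nq.begin; q < nq.end; ++q)
      for (std::size_t r = nr.begin; r < nr.end; ++r)
        rules.BaseCase(q, r);
  } else if (qLeaf) {
    for (const auto& c : nr.children) DualDepthFirstTraversal(nq, *c, rules);
  } else if (rLeaf) {
    for (const auto& c : nq.children) DualDepthFirstTraversal(*c, nr, rules);
  } else {
    for (const auto& cq : nq.children)
      for (const auto& cr : nr.children)
        DualDepthFirstTraversal(*cq, *cr, rules);
  }
}
```

Swapping in a different rule class with the same two member functions, or a different tree type exposing the same members, would leave the traversal untouched; that is the design choice the template parameters are meant to capture.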

Of course, the utility of these abstractions is not
limited to implementation details. Each of the papers cited
in the previous sections describes algorithms in terms
of one specific tree structure. March et al. (2010)
discuss implementations of Boruvka's Algorithm on
both kd-trees and cover trees and give algorithms for
both. Each algorithm given is quite different and it
is not easy to see their similarities. Using our
meta-algorithm, any of these tree-based algorithms can be
expressed with less effort (especially for more complex
trees like the cover tree) and in a more general sense.

In addition, correctness proofs for our algorithms tend
to be quite simple. The proofs for each algorithm here
can be given in two simple sub-proofs: (1) prove the
correctness of BaseCase() when no prunes are made,
and (2) prove that Score() does not prune any
subtrees on which the correctness of the results depends.

The logical split of base case, pruning rule, tree type,
and traversal can also be advantageous. A strong
example of this is the function $B(\mathscr{N}_q)$ devised in Section
5, which is a novel, tighter bound. When not
considering a particular tree, the path to a superior algorithm
can often be simpler (as in that case).



9.1. Parallelism 

Nowhere in this paper has parallelism been discussed 
in any detail. In fact, all of the given algorithms seem 
to be suited to serial implementation. However, the 
pruning dual-tree traversal is entirely separate from 
the rest of the dual-tree algorithm; therefore, a par- 
allel pruning dual-tree traversal can be used without 
modifying the rest of the algorithm. 

For instance, consider k-nearest neighbor search. Most
large-scale parallel implementations of k-NN do not
use space trees but instead techniques like LSH for
fast (but inexact) search. To our knowledge, no freely
available software exists that implements distributed
dual-tree k-nearest neighbor search.

As a simple (and not necessarily efficient)
proof-of-concept idea for a distributed traversal, suppose we
have $t^2$ machines and a "master" machine for some
$t > 0$. Then, for a query tree $\mathscr{T}_q$ and a reference tree
$\mathscr{T}_r$, we can split $\mathscr{T}_q$ into $t$ subtrees and one "master
tree" $\mathscr{T}_{qm}$. The reference tree $\mathscr{T}_r$ is split the same
way. Each possible combination of query and reference
subtrees is stored on one of the $t^2$ machines, and the
master trees are stored on the master machine. The
lists $D$ and $N$ can be stored on the master machine
and can be updated or queried by other machines.

The traversal starts at the roots of the query tree
and reference tree and proceeds serially on the
master. When a combination in two subtrees is reached,
Score() and BaseCase() are performed on the
machine containing those two subtrees and that subtree
traversal continues in parallel. This idea satisfies the
conditions of a pruning dual-tree traversal; thus, we
can use it to make any dual-tree algorithm parallel.
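
The work decomposition in this proof-of-concept can be simulated in a few lines. The sketch below is only an illustration under strong simplifications: ordinary function calls stand in for the $t^2$ machines, a brute-force loop stands in for each per-machine subtree traversal, no pruning is performed, and the names `Result`, `RunTask`, and `SimulatedDistributedSearch` are hypothetical. It shows that the $t^2$ subtree combinations are independent tasks whose results can be merged into lists kept on the master.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One entry of the master's lists: best distance and neighbor for a query.
struct Result {
  double distance;
  std::size_t neighbor;
};

// Work for one (query part, reference part) combination. In the scheme
// described above this would run on its own machine as a subtree traversal;
// here it is a brute-force loop that updates the master's lists directly.
void RunTask(const std::vector<double>& query,
             const std::vector<double>& reference,
             std::size_t qBegin, std::size_t qEnd,
             std::size_t rBegin, std::size_t rEnd,
             std::vector<Result>& master) {
  for (std::size_t q = qBegin; q < qEnd; ++q) {
    for (std::size_t r = rBegin; r < rEnd; ++r) {
      const double d = std::fabs(query[q] - reference[r]);
      if (d < master[q].distance) master[q] = Result{d, r};
    }
  }
}

// Split both point sets into t contiguous parts and run all t * t tasks.
std::vector<Result> SimulatedDistributedSearch(
    const std::vector<double>& query,
    const std::vector<double>& reference, std::size_t t) {
  assert(t > 0);
  const std::size_t nq = query.size(), nr = reference.size();
  std::vector<Result> master(
      nq, Result{std::numeric_limits<double>::infinity(), 0});
  for (std::size_t i = 0; i < t; ++i)
    for (std::size_t j = 0; j < t; ++j)
      RunTask(query, reference, i * nq / t, (i + 1) * nq / t,
              j * nr / t, (j + 1) * nr / t, master);
  return master;
}
```

Because each task touches a disjoint set of point pairs, the merged result does not depend on how many parts are used; a real distributed version would additionally need synchronized access to the master's lists.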

Recently, a distributed dual-tree algorithm was devel- 
oped for kernel summation (Lee et al., 2012); this work 
could be adapted to a generalized distributed pruning 
dual-tree traversal for use with our meta-algorithm. 

10. Conclusion 

We have proposed a tree-independent representation
of dual-tree algorithms and a meta-algorithm which
can be used to create these algorithms. A dual-tree
algorithm is represented as four parts: a space tree,
a pruning dual-tree traversal, a BaseCase() function,
and a Score() function. We applied this
representation to generalize and extend five example dual-tree
algorithms to different types of trees and traversals.
During this process, we also devised a novel bound for
k-nearest neighbor search that is tighter than existing
bounds. Importantly, this abstraction can be applied
to help approach the problem of parallel dual-tree
algorithms, which is currently not well researched.